OpenAI Leverages Google Search Data via Third-Party Scraping to Compete in AI
Microsoft-backed OpenAI has been accessing Google's search results indirectly through SerpApi, a web-scraping service, to improve ChatGPT's answers to real-time queries. The arrangement works around Google's refusal last year to grant direct API access. Google has attempted technical countermeasures against scraping, but legal action remains off the table amid ongoing antitrust scrutiny.
ChatGPT's hybrid data strategy combines scraped Google results with Microsoft Bing results and OpenAI's own web crawling. Tests reveal occasional verbatim similarities between ChatGPT responses and Google's featured snippets, a sign of lingering dependence on the search giant's infrastructure despite OpenAI's efforts to diversify its sources.